Query-guided End-to-End Person Search
Person search has recently gained attention as the novel task of finding a
person, provided as a cropped sample, from a gallery of non-cropped images,
whereby several other people are also visible. We believe that i. person
detection and re-identification should be pursued in a joint optimization
framework and that ii. the person search should leverage the query image
extensively (e.g. emphasizing unique query patterns). However, so far, no prior
art realizes this. We introduce a novel query-guided end-to-end person search
network (QEEPS) to address both aspects. We build upon the recent joint
detection and re-identification work OIM [37], extending it with i. a
query-guided Siamese squeeze-and-excitation network (QSSE-Net) that uses global
context from both the query and gallery images, ii. a query-guided region
proposal network (QRPN) to produce query-relevant proposals, and iii. a
query-guided similarity subnetwork (QSimNet) to learn a query-guided
re-identification score. QEEPS is the first end-to-end query-guided detection
and re-id network. On both the recent CUHK-SYSU [37] and PRW [46] datasets, we
outperform the previous state-of-the-art by a large margin.
Comment: Accepted as poster at CVPR 201
Temperature Dependence of a Sub-wavelength Compact Graphene Plasmon-Slot Modulator
We investigate a plasmonic electro-optic modulator with an extinction ratio
exceeding 1 dB/μm by engineering the optical mode to be in-plane with the
graphene layer, and show that lowering the operating temperature enables
steeper switching, thus improving the dynamic energy consumption. Further, we
show that multi-layer graphene integrated with a plasmonic slot waveguide
allows for in-plane electric field components, and 3-dB device lengths as
short as a few hundred nanometers. Compact modulators approaching electronic
length scales pave the way for ultra-dense photonic integrated circuits with
minimal footprint.
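As a quick sanity check on the reported figures, the 3-dB device length follows directly from the extinction ratio per unit length (the 10 dB/μm value below is an assumed illustration for the multi-layer case, not a number from the abstract):

```python
def three_db_length_um(extinction_ratio_db_per_um):
    # A modulator with extinction ratio ER (dB/um) needs 3/ER um
    # of length to accumulate 3 dB of modulation depth.
    return 3.0 / extinction_ratio_db_per_um

print(three_db_length_um(1.0))    # 3.0 um at the reported >1 dB/um
print(three_db_length_um(10.0))   # 0.3 um = 300 nm; an assumed 10 dB/um
                                  # would match "a few hundred nanometers"
```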
Class interference regularization
Contrastive losses yield state-of-the-art performance for person re-identification, face verification and few-shot learning. They have recently outperformed the cross-entropy loss on classification at the ImageNet scale and outperformed all prior self-supervision results by a large margin (SimCLR). Simple and effective regularization techniques such as label smoothing and self-distillation no longer apply, because they act on the multinomial label distributions adopted in cross-entropy losses, and not on the tuple comparative terms which characterize contrastive losses.
Here we propose a novel, simple and effective regularization technique, the Class Interference Regularization (CIR), which applies to cross-entropy losses but is especially effective on contrastive losses. CIR perturbs the output features by randomly moving them towards the average embeddings of the negative classes. To the best of our knowledge, CIR is the first regularization technique to act on the output features.
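The CIR perturbation described above can be sketched in a few lines of numpy (an illustrative interpretation, not the authors' code; the linear-interpolation form, the strength value, and the precomputed class means are assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)

def class_interference(features, labels, class_means, strength=0.1):
    """Class Interference Regularization (illustrative sketch).

    For each sample, pick a random negative class and move the output
    feature by a random small step towards that class's mean embedding.
    """
    out = features.copy()
    n_classes = class_means.shape[0]
    for i, y in enumerate(labels):
        negatives = [c for c in range(n_classes) if c != y]
        c = rng.choice(negatives)                # a random negative class
        alpha = strength * rng.random()          # random step in [0, strength)
        out[i] = (1 - alpha) * out[i] + alpha * class_means[c]
    return out

feats = rng.standard_normal((4, 16))             # output features of a batch
labels = np.array([0, 1, 2, 0])
means = rng.standard_normal((3, 16))             # per-class mean embeddings
perturbed = class_interference(feats, labels, means)
assert perturbed.shape == feats.shape
```

The perturbation is applied only at training time; at test time the features are used unchanged.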
In experimental evaluation, the combination of CIR and a plain Siamese-net with triplet loss yields the best few-shot learning performance on the challenging tieredImageNet. CIR also improves the state-of-the-art technique in person re-identification on the Market-1501 dataset, based on triplet loss, and the state-of-the-art technique in person search on the CUHK-SYSU dataset, based on a cross-entropy loss. Finally, on the task of classification, CIR performs on par with the popular label smoothing, as demonstrated on CIFAR-10 and -100.
Joint Detection and Tracking in Videos with Identification Features
Recent works have shown that combining object detection and tracking tasks,
in the case of video data, results in higher performance for both tasks, but
they strictly require a high frame-rate to perform well. This
assumption is often violated in real-world applications, where models run on
embedded devices, often at only a few frames per second.
Videos at low frame-rate suffer from large object displacements. Here,
re-identification features may help to match detections across large
displacements, but current joint detection and re-identification formulations
degrade detector performance, as the two tasks conflict. In real-world
applications, having separate detector and re-id models is often not feasible,
as both memory and runtime effectively double.
Towards robust long-term tracking applicable to reduced-computational-power
devices, we propose the first joint optimization of detection, tracking and
re-identification features for videos. Notably, our joint optimization
maintains the detector performance, a typical multi-task challenge. At
inference time, we leverage detections for tracking (tracking-by-detection)
when the objects are visible, detectable and slowly moving in the image. We
leverage instead re-identification features to match objects which disappeared
(e.g. due to occlusion) for several frames or were not tracked due to fast
motion (or low-frame-rate videos). Our proposed method reaches the
state-of-the-art on MOT, ranks 1st in the UA-DETRAC'18 tracking challenge
among online trackers, and 3rd overall.
Comment: Accepted at the Image and Vision Computing Journal
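The inference-time switching between tracking-by-detection and re-id matching can be sketched as follows (a simplified illustration; the IoU and similarity thresholds are assumed values, not the paper's):

```python
import numpy as np

def iou(a, b):
    # boxes as (x1, y1, x2, y2)
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def match(track_box, track_emb, det_box, det_emb,
          iou_thresh=0.3, sim_thresh=0.7):
    """Two-stage association (illustrative sketch): spatial overlap
    first; appearance similarity as the fallback for fast motion,
    low frame-rates, or re-appearing objects."""
    if iou(track_box, det_box) >= iou_thresh:
        return True                               # tracking-by-detection
    cos = float(track_emb @ det_emb /
                (np.linalg.norm(track_emb) * np.linalg.norm(det_emb)))
    return cos >= sim_thresh                      # re-id fallback

# Slow motion: boxes overlap, matched spatially.
e = np.ones(8)
assert match((0, 0, 10, 10), e, (1, 1, 11, 11), e)
# Large displacement: no overlap, matched by identical embeddings.
assert match((0, 0, 10, 10), e, (50, 50, 60, 60), e)
```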
Coherent Multi-Sentence Video Description with Variable Level of Detail
Humans can easily describe what they see in a coherent way and at varying
levels of detail. However, existing approaches for automatic video description
are mainly focused on single sentence generation and produce descriptions at a
fixed level of detail. In this paper, we address both of these limitations: for
a variable level of detail we produce coherent multi-sentence descriptions of
complex videos. We follow a two-step approach where we first learn to predict a
semantic representation (SR) from video and then generate natural language
descriptions from the SR. To produce consistent multi-sentence descriptions, we
model cross-sentence consistency at the level of the SR by enforcing a
consistent topic. We also contribute to the visual recognition of objects by
proposing a hand-centric approach, and to the robust generation of sentences
by using a word lattice. Human judges rate our multi-sentence
descriptions as more readable, correct, and relevant than related work. To
understand the difference between more detailed and shorter descriptions, we
collect and analyze a video description corpus with three levels of detail.
Comment: 10 pages
Forecasting People Trajectories and Head Poses by Jointly Reasoning on Tracklets and Vislets
In this work, we explore the correlation between people trajectories and
their head orientations. We argue that people trajectory and head pose
forecasting can be modelled as a joint problem. Recent approaches on trajectory
forecasting leverage short-term trajectories (aka tracklets) of pedestrians to
predict their future paths. In addition, sociological cues, such as expected
destination or pedestrian interaction, are often combined with tracklets. In
this paper, we propose MiXing-LSTM (MX-LSTM) to capture the interplay between
positions and head orientations (vislets) thanks to a joint unconstrained
optimization of full covariance matrices during the LSTM backpropagation. We
additionally exploit the head orientations as a proxy for the visual attention,
when modeling social interactions. MX-LSTM predicts future pedestrian
locations and head poses, extending the capabilities of current approaches to
long-term trajectory forecasting. Compared to the state-of-the-art, our
approach shows better performance on an extensive set of public benchmarks.
MX-LSTM is particularly effective when people move slowly, i.e. the most
challenging scenario for all other models. The proposed approach also allows
for accurate predictions on a longer time horizon.
Comment: Accepted at IEEE Transactions on Pattern Analysis and Machine Intelligence 2019. arXiv admin note: text overlap with arXiv:1805.0065
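The "unconstrained optimization of full covariance matrices" can be made concrete with a common trick, sketched here in numpy (a 2x2 case with a Cholesky parameterization and exponentiated diagonal; this is an assumed illustration, and MX-LSTM's exact parameterization may differ): any real-valued parameter vector maps to a valid covariance, so the network output needs no positive-definiteness constraint during backpropagation.

```python
import numpy as np

def covariance_from_params(theta):
    """Build a valid 2x2 covariance from 3 unconstrained parameters
    via a Cholesky factor L, so that sigma = L @ L.T is always
    symmetric positive-definite."""
    l11, l21, l22 = theta
    L = np.array([[np.exp(l11), 0.0],
                  [l21,         np.exp(l22)]])   # exp keeps diagonal > 0
    return L @ L.T

def gaussian_nll(x, mu, theta):
    """Negative log-likelihood of x under N(mu, sigma(theta))."""
    sigma = covariance_from_params(theta)
    d = x - mu
    return 0.5 * (d @ np.linalg.solve(sigma, d)
                  + np.log(np.linalg.det(sigma))
                  + 2 * np.log(2 * np.pi))

theta = np.array([0.0, 0.5, -0.2])               # any reals are valid
sigma = covariance_from_params(theta)
# The resulting covariance is symmetric positive-definite by construction.
assert np.all(np.linalg.eigvals(sigma) > 0)
```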
Substantial and sustained reduction in under-5 mortality, diarrhea, and pneumonia in Oshikhandass, Pakistan : Evidence from two longitudinal cohort studies 15 years apart
Funding Information: Study 1 was funded through the Applied Diarrheal Disease Research Program at the Harvard Institute for International Development with a grant from USAID (Project 936-5952, Cooperative Agreement # DPE-5952-A-00-5073-00), and the Aga Khan Health Service, Northern Areas and Chitral, Pakistan. Study 2 was funded by the Pakistan US S&T Cooperative Agreement between the Pakistan Higher Education Commission (HEC) (No.4-421/PAK-US/HEC/2010/955, grant to the Karakoram International University) and the US National Academies of Science (Grant Number PGA-P211012 from NAS to the Fogarty International Center). The funding bodies had no role in the design of the study, data collection, analysis, interpretation, or writing of the manuscript. Publisher Copyright: © 2020 The Author(s). Peer reviewed. Publisher PDF.
Capturing road users on a traffic route
The invention relates to a method (10) for capturing road users (12) on a traffic route (14) in an image, comprising: generating (42) a plurality of region proposals (18) for possible objects recorded in the image (16) by applying a region proposal generator; providing object detection (72) for all region proposals (18) in order to detect the traffic route (14) and/or the road users (12) by classification, taking a predefined confidence level into account; outputting the detection data received from the object detection; and providing a filtering (48) of the region proposals (18) before the object detection step, the filtering being carried out on the basis of respective filter data estimated from the relevance of the region proposals (18) in relation to the road users (12) and/or the traffic route (14).
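The claimed pipeline (propose regions, filter them by estimated relevance, then run detection only on the survivors) can be sketched as follows (hypothetical names and threshold, purely illustrative of the claim's structure):

```python
def filter_proposals(proposals, relevance, threshold=0.5):
    # Keep only region proposals whose estimated relevance to the
    # road users and/or the traffic route exceeds a threshold,
    # before running the (more expensive) object detector on them.
    return [p for p, r in zip(proposals, relevance) if r >= threshold]

boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (100, 100, 110, 110)]
scores = [0.9, 0.2, 0.7]          # assumed relevance estimates
kept = filter_proposals(boxes, scores)
assert kept == [(0, 0, 10, 10), (100, 100, 110, 110)]
```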